72 research outputs found

    Pairwise Independent Random Walks Can Be Slightly Unbounded

    A family of problems that have been studied in the context of various streaming algorithms are generalizations of the fact that the expected maximum distance of a 4-wise independent random walk on a line over $n$ steps is $O(\sqrt{n})$. For small values of $k$, there exist $k$-wise independent random walks that can be stored in much less space than storing $n$ random bits, so these properties are often useful for lowering space bounds. In this paper, we show that for all of these examples, 4-wise independence is required by demonstrating a pairwise independent random walk with steps uniform in $\pm 1$ and expected maximum distance $\Omega(\sqrt{n} \lg n)$ from the origin. We also show that this bound is tight for the first and second moment, i.e. the expected maximum square distance of a 2-wise independent random walk is always $O(n \lg^2 n)$. Also, for any even $k \ge 4$, we show that the $k$th moment of the maximum distance of any $k$-wise independent random walk is $O(n^{k/2})$. The previous two results generalize to random walks tracking insertion-only streams, and provide higher moment bounds than currently known. We also prove a generalization of Kolmogorov's maximal inequality by showing an asymptotically equivalent statement that requires only 4-wise independent random variables with bounded second moments, which also generalizes a result of Blasiok
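As a rough illustration of the $O(\sqrt{n})$ baseline mentioned in this abstract, the sketch below simulates a walk with fully independent $\pm 1$ steps (which is in particular 4-wise independent) and estimates the expected maximum distance from the origin. It does not reproduce the paper's pairwise independent construction; the function names, trial counts, and seed are illustrative assumptions.

```python
import math
import random


def max_distance(n, rng):
    """Largest |position| reached by a +/-1 walk of n fully independent steps."""
    position = 0
    farthest = 0
    for _ in range(n):
        position += 1 if rng.random() < 0.5 else -1
        farthest = max(farthest, abs(position))
    return farthest


def mean_max_distance(n, trials, seed=0):
    """Monte Carlo estimate of the expected maximum distance over n steps."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    return sum(max_distance(n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    # If the maximum distance grows like sqrt(n), this ratio should stay
    # roughly constant as n increases.
    for n in (100, 400, 1600):
        estimate = mean_max_distance(n, trials=500)
        print(n, round(estimate / math.sqrt(n), 3))
```

A walk built from only pairwise independent steps would need a different generator; the abstract's claim is that such a walk can exceed this baseline by a $\lg n$ factor.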

    Better and Simpler Lower Bounds for Differentially Private Statistical Estimation

    We provide improved lower bounds for two well-known high-dimensional private estimation tasks. First, we prove that for estimating the covariance of a Gaussian up to spectral error $\alpha$ with approximate differential privacy, one needs $\tilde{\Omega}\left(\frac{d^{3/2}}{\alpha \varepsilon} + \frac{d}{\alpha^2}\right)$ samples for any $\alpha \le O(1)$, which is tight up to logarithmic factors. This improves over previous work which established this for $\alpha \le O\left(\frac{1}{\sqrt{d}}\right)$, and is also simpler than previous work. Next, we prove that for estimating the mean of a heavy-tailed distribution with bounded $k$th moments with approximate differential privacy, one needs $\tilde{\Omega}\left(\frac{d}{\alpha^{k/(k-1)} \varepsilon} + \frac{d}{\alpha^2}\right)$ samples. This matches known upper bounds and improves over the best known lower bound for this problem, which only holds for pure differential privacy, or when $k = 2$. Our techniques follow the method of fingerprinting and are generally quite simple. Our lower bound for heavy-tailed estimation is based on a black-box reduction from privately estimating identity-covariance Gaussians. Our lower bound for covariance estimation utilizes a Bayesian approach to show that, under an Inverse Wishart prior distribution for the covariance matrix, no private estimator can be accurate even in expectation, without sufficiently many samples. Comment: 23 pages

    Optimal Time-Backlog Tradeoffs for the Variable-Processor Cup Game

    The $p$-processor cup game is a classic and widely studied scheduling problem that captures the setting in which a $p$-processor machine must assign tasks to processors over time in order to ensure that no individual task ever falls too far behind. The problem is formalized as a multi-round game in which two players, a filler (who assigns work to tasks) and an emptier (who schedules tasks), compete. The emptier's goal is to minimize backlog, which is the maximum amount of outstanding work for any task. Recently, Kuszmaul and Westover (ITCS, 2021) proposed the variable-processor cup game, which considers the same problem, except that the amount of resources available to the players (i.e., the number $p$ of processors) fluctuates between rounds of the game. They showed that this seemingly small modification fundamentally changes the dynamics of the game: whereas the optimal backlog in the fixed $p$-processor game is $\Theta(\log n)$, independent of $p$, the optimal backlog in the variable-processor game is $\Theta(n)$. The latter result was only known to apply to games with exponentially many rounds, however, and it has remained an open question what the optimal tradeoff between time and backlog is for shorter games. This paper establishes a tight trade-off curve between time and backlog in the variable-processor cup game. Importantly, we prove that for a game consisting of $t$ rounds, the optimal backlog is $\Theta(n)$ if and only if $t \ge \Omega(n^3)$. Our techniques also allow us to resolve several other open questions concerning how the variable-processor cup game behaves in beyond-worst-case-analysis settings. Comment: 40 pages, published in International Conference on Automata, Languages, and Programming (ICALP), 2022. Abstract abridged for arXiv submission: see paper for full abstract. Updated to acknowledge additional funding
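To make the filler/emptier dynamics concrete, here is a minimal sketch of the fixed single-processor game ($p = 1$) only, not the variable-processor game studied in the paper. It assumes a folklore adversarial filler that spreads one unit of water evenly over the cups the emptier has not yet touched, and a greedy emptier that removes up to one unit from the fullest cup. The function name and the rule details are illustrative assumptions.

```python
def play_single_processor_cup_game(n):
    """Return the backlog a folklore filler forces against a greedy emptier.

    Assumed rules (p = 1): each round the filler adds one unit of water in
    total, then the emptier removes up to one unit from a single cup.
    """
    fills = [0.0] * n
    active = list(range(n))  # cups the filler is still pouring into
    backlog = 0.0
    while len(active) > 1:
        # Filler: spread one unit evenly over the active cups.
        share = 1.0 / len(active)
        for cup in active:
            fills[cup] += share
        backlog = max(backlog, max(fills))
        # Greedy emptier: remove up to one unit from the fullest cup.
        target = max(range(n), key=lambda c: fills[c])
        fills[target] = max(0.0, fills[target] - 1.0)
        # Filler stops pouring into the cup that was just emptied.
        if target in active:
            active.remove(target)
        else:
            active.pop()
    return backlog


if __name__ == "__main__":
    for n in (8, 64, 512):
        print(n, round(play_single_processor_cup_game(n), 3))
```

Under these assumptions the forced backlog is $1/2 + 1/3 + \dots + 1/n$, which grows logarithmically in $n$, in line with the $\Theta(\log n)$ behaviour the abstract cites for the fixed-processor game.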

    A faster and simpler algorithm for learning shallow networks

    We revisit the well-studied problem of learning a linear combination of $k$ ReLU activations given labeled examples drawn from the standard $d$-dimensional Gaussian measure. Chen et al. [CDG+23] recently gave the first algorithm for this problem to run in $\text{poly}(d,1/\varepsilon)$ time when $k = O(1)$, where $\varepsilon$ is the target error. More precisely, their algorithm runs in time $(d/\varepsilon)^{\mathrm{quasipoly}(k)}$ and learns over multiple stages. Here we show that a much simpler one-stage version of their algorithm suffices, and moreover its runtime is only $(d/\varepsilon)^{O(k^2)}$. Comment: 14 pages

    Improved Diversity Maximization Algorithms for Matching and Pseudoforest

    In this work we consider the diversity maximization problem, where given a data set $X$ of $n$ elements, and a parameter $k$, the goal is to pick a subset of $X$ of size $k$ maximizing a certain diversity measure. [CH01] defined a variety of diversity measures based on pairwise distances between the points. A constant factor approximation algorithm was known for all those diversity measures except "remote-matching", where only an $O(\log k)$ approximation was known. In this work we present an $O(1)$ approximation for this remaining notion. Further, we consider these notions from the perspective of composable coresets. [IMMM14] provided composable coresets with a constant factor approximation for all but "remote-pseudoforest" and "remote-matching", for which, again, they obtained only an $O(\log k)$ approximation. Here we also close the gap up to constants and present a constant factor composable coreset algorithm for these two notions. For remote-matching, our coreset has size only $O(k)$, and for remote-pseudoforest, our coreset has size $O(k^{1+\varepsilon})$ for any $\varepsilon > 0$, for an $O(1/\varepsilon)$-approximate coreset. Comment: 27 pages, 1 table. Accepted to APPROX, 202

    POSE ESTIMATION FOR ROBOTIC DISASSEMBLY USING RANSAC WITH LINE FEATURES

    In this thesis, a new technique is proposed to recognize and estimate the pose of a given 3-D object from a single real image, given prior knowledge of its approximate structure. Metrics to evaluate the correctness of a calculated pose are presented and analyzed. The traditional and the more recent approaches used in solving this problem are explored and the various methodologies adopted are discussed. The first step in disassembling a given assembly from its image is to recognize the attitude and translation of each of its constituent components - a fundamental problem which is being addressed in this work. The proposed algorithm does not depend on uniquely identifiable 3-D model surface features for its operation - this makes it ideally suited for object recognition for assemblies. The algorithm works well even for low-resolution occluded object images taken under variable illumination conditions and heavy shadows and performs markedly better when these factors are removed. The algorithm uses a combination of various computer vision concepts such as segmentation, corner detection and camera calibration, and subsequently adopts a line-based object pose estimation technique (originally based on the RANSAC algorithm) to settle on the best pose estimate. The novelty of the proposed technique lies in the specific way in which the poses are evaluated in the RANSAC-like algorithm. In particular, line-based pose evaluation is adopted where the line chamfer image is used to evaluate the error distance between the projected model line and the image edges. The correctness of the computed pose is determined based on the number of line matches computed using this error distance. As opposed to the RANSAC algorithm where the search process is pseudo-random, we do an exhaustive pose search instead. Techniques to reduce the search space by a large amount are discussed and implemented.
The algorithm was used to estimate the pose of 28 objects in 22 images, where some images contain multiple objects. The algorithm has been found to work with a 3-D mismatch error of less than 2.5 cm in 90% of the cases and less than 1 cm error in 53% of the cases in the dataset used.
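The line-based pose evaluation described in this abstract can be sketched roughly as follows. This is a simplified stand-in, not the thesis implementation: it looks up the nearest edge pixel by brute force instead of using a precomputed line chamfer image, and the function names, sample count, and threshold are hypothetical choices made for illustration.

```python
import math


def nearest_edge_distance(point, edge_pixels):
    """Euclidean distance from a point to the closest edge pixel."""
    px, py = point
    return min(math.hypot(px - ex, py - ey) for ex, ey in edge_pixels)


def line_error(start, end, edge_pixels, samples=11):
    """Mean edge distance over points sampled along a projected model line."""
    total = 0.0
    for i in range(samples):
        t = i / (samples - 1)
        point = (start[0] + t * (end[0] - start[0]),
                 start[1] + t * (end[1] - start[1]))
        total += nearest_edge_distance(point, edge_pixels)
    return total / samples


def count_line_matches(projected_lines, edge_pixels, threshold=1.0):
    """Number of projected lines whose error is within the threshold.

    A candidate pose with more matching lines would be scored as better.
    """
    return sum(1 for start, end in projected_lines
               if line_error(start, end, edge_pixels) <= threshold)


if __name__ == "__main__":
    # Toy edge map: a horizontal edge along y = 5.
    edges = [(x, 5) for x in range(11)]
    candidate_lines = [((0, 5), (10, 5)), ((0, 8), (10, 8))]
    print(count_line_matches(candidate_lines, edges))
```

In an exhaustive search of the kind the abstract mentions, a score like `count_line_matches` would be computed for each candidate pose's projected model lines, and the pose with the highest count kept.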